On Probabilistic Models for Uncertain Sequential Pattern Mining
نویسندگان
چکیده
We study uncertainty models in sequential pattern mining. We consider situations where there is uncertainty either about a source or an event. We show that both these types of uncertainties could be modelled using probabilistic databases, and give possible-worlds semantics for both. We then describe ”interestingness” criteria based on two notions of frequentness (previously studied for frequent itemset mining) namely expected support [C. Aggarwal et al. KDD’09;Chui et al., PAKDD’07,’08] and probabilistic frequentness [Bernecker et al., KDD’09]. We study the interestingness criteria from a complexity-theoretic perspective, and show that in case of source-level uncertainty, evaluating probabilistic frequentness is #P-complete, and thus no polynomial time algorithms are likely to exist, but evaluate the interestingness predicate in polynomial time in the remaining cases.
منابع مشابه
Mining sequential patterns from probabilistic data
Sequential Pattern Mining (SPM) is an important data mining problem. Although it is assumed in classical SPM that the data to be mined is deterministic, it is now recognized that data obtained from a wide variety of data sources is inherently noisy or uncertain, such as data from sensors or data being collected from the web from different (potentially conflicting) data sources. Probabilistic da...
متن کاملDistributed Sequential Pattern Mining in Large Scale Uncertain Databases
While sequential pattern mining (SPM) is an import application in uncertain databases, it is challenging in efficiency and scalability. In this paper, we develop a dynamic programming (DP) approach to mine probabilistic frequent sequential patterns in distributed computing platform Spark. Directly applying the DP method to Spark is impractical because its memory-consuming characteristic may cau...
متن کاملUncertainty in Sequential Pattern Mining
We study uncertainty models in sequential pattern mining. We discuss some kinds of uncertainties that could exist in data, and show how these uncertainties can be modelled using probabilistic databases. We then obtain possible world semantics for them and show how frequent sequences could be mined using the probabilistic frequentness measure.
متن کاملDiscovering Probabilistic Frequent Sequential Patterns in Uncertain Databases under Systolic Tree
Uncertain data are intrinsic in many real-world applications such as mobile tracking and environment surveillance. Mining sequential patterns from imprecise data, such as those data arising from GPS trajectories and sensor readings are important for discovering hidden knowledge in such applications. We establish two uncertain sequence data models abstracted from many real-life applications invo...
متن کاملProbabilistic Deterministic Classifier Based Sequential Pattern Mining to Evaluate Structural Pattern on Chemical Bonding
ISSN: 2347-8578 www.ijcstjournal.org Page 223 Probabilistic Deterministic Classifier Based Sequential Pattern Mining to Evaluate Structural Pattern on Chemical Bonding S.Sathya , N.Rajendran [2] Research Scholar , Bharathiar University, Coimbatore Principal ,Vivekanandha arts & science college,Sankiri,Salem(dt) India ABSTRACT Evaluating the structural patterns of chemical bonding involves ident...
متن کامل